Method and apparatus for verifying security of authentication information extracted from a user

ABSTRACT

A method and apparatus are provided for evaluating the security of authentication information that is extracted from a user. The disclosed authentication information security analysis techniques determine whether extracted authentication information can be obtained by an attacker. The extracted authentication information might be, for example, personal identification numbers (PINs), passwords and query based passwords (questions and answers). A disclosed authentication information security analysis process employs information extraction techniques to verify that the authentication information provided by a user is not easily obtained through an online search. The authentication information security analysis process measures the security of authentication information, such as query based passwords, provided by a user. Information extraction techniques are employed to find and report relations between the proposed password and certain user information that might make the proposed password vulnerable to attack.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation-in-part of U.S. patent application Ser. No. 10/723,416, filed Nov. 26, 2003, entitled “Method and Apparatus for Extracting Authentication Information from a User,” incorporated by reference herein.

FIELD OF THE INVENTION

The present invention relates generally to user authentication techniques and, more particularly, to methods and apparatus for generating user passwords.

BACKGROUND OF THE INVENTION

Most computers and computer networks incorporate computer security techniques, such as access control mechanisms, to prevent unauthorized users from accessing remote resources. Human authentication is the process of verifying the identity of a user in a computer system, often as a prerequisite to allowing access to resources in the system. A number of authentication protocols have been proposed or suggested to prevent the unauthorized access of remote resources. In one variation, each user has a password that is presumably known only to the authorized user and to the authenticating host. Before accessing the remote resource, the user must provide the appropriate password, to prove his or her authority.

Generally, a good password is easy for the user to remember, yet not easily guessed by an attacker. In order to improve the security of passwords, the number of login attempts is often limited (to prevent an attacker from guessing a password) and users are often required to change their password periodically. Some systems use simple methods such as minimum password length, prohibition of dictionary words and techniques to evaluate a user-selected password at the time the password is selected, to ensure that the password is not particularly susceptible to being guessed. As a result, users are often prevented from using passwords that are easily recalled. In addition, many systems generate random passwords that users are required to use.

In a call center environment, users are often authenticated using traditional query directed authentication techniques by asking them personal questions, such as their social security number, date of birth or mother's maiden name. The query can be thought of as a hint to “pull” a fact from a user's long term memory. As such, the answer need not be memorized. Although convenient, traditional authentication protocols based on queries are not particularly secure.

U.S. patent application Ser. No. 10/723,416, entitled “Method and Apparatus for Extracting Authentication Information from a User,” improves the security of such authentication protocols by extracting information from a user's memory that will be easily recalled by the user during future authentication yet is hard for an attacker to guess. The information might be a little-known fact of personal relevance to the user (such as an old telephone number) or the personal details surrounding a public event (such as the user's environment on Sep. 11, 2001) or a private event (such as an accomplishment of the user). Users are guided to appropriate topics and information extraction techniques are employed to verify that the information is not easily attacked and to estimate how many bits of assurance the question and answer provide. A need exists for methods and apparatus that evaluate the security of authentication information that is extracted from a user. A further need exists for information extraction techniques that verify whether extracted authentication information can be easily obtained by an attacker.

SUMMARY OF THE INVENTION

Generally, a method and apparatus are provided for evaluating the security of authentication information that is extracted from a user. The disclosed authentication information security analysis techniques determine whether extracted authentication information can be obtained by an attacker. The extracted authentication information might be, for example, personal identification numbers (PINs), passwords and query based passwords (questions and answers).

According to one aspect of the invention, a disclosed authentication information security analysis process employs information extraction techniques to verify that the authentication information provided by a user is not easily obtained through an online search. Generally, the authentication information security analysis process measures the security of authentication information, such as query based passwords, provided by a user. Information extraction techniques are employed to find and report relations between the proposed password and certain user information that might make the proposed password vulnerable to attack.

In one exemplary implementation, three exemplary rule classes are employed to determine whether a proposed password may be obtained by an attacker. A first class of rules, referred to as “self association rules,” determines whether a proposed answer is associated with the user. A second class of rules, referred to as “hint association rules,” determines whether a proposed answer is associated with a proposed hint in a particular relation. For example, the information extraction techniques performed according to the hint association rules can determine if there is a predefined relationship between the owner of a telephone number and the user, such as a family member (self, sibling or parent), co-author, teammate, colleague or member of the same household or community. A third class of rules, referred to as “commonality rules,” determines whether the proposed answer is so common that it is easily guessed from the proposed hint.

A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a network environment in which the present invention can operate;

FIG. 2 is a schematic block diagram illustrating the password enrollment/verification server of FIG. 1 in further detail;

FIG. 3 is a sample table from an exemplary user database of FIGS. 1 and 2;

FIG. 4 is a flow chart describing an exemplary implementation of an enrollment process of FIG. 2 incorporating features of the present invention;

FIG. 5 is a flow chart describing an exemplary implementation of a verification process of FIG. 2 incorporating features of the present invention;

FIG. 6 is an exemplary user interface that presents a user with a set of topics from which the user can select a given topic for which the user will provide one or more answers;

FIG. 7 is an exemplary user interface that presents the user with a set of sub-topics from which the user can select a given topic for which the user will provide one or more answers;

FIG. 8 is an exemplary user interface that allows a user to enter a proposed answer for evaluation;

FIG. 9 is an exemplary user interface that allows a user to enter a proposed reminder or hint associated with a particular answer for evaluation;

FIGS. 10 and 11 are exemplary dialog boxes that present a user with information if a proposed password is rejected;

FIG. 12 is an exemplary user interface that presents the selected answer and reminder to the user and optionally allows the user to specify whether periodic reminders should be sent;

FIG. 13 illustrates the relationship between the user information, proposed answer and proposed hint that can be tested by the present invention for various relations that render the proposed password vulnerable to attack;

FIG. 14 illustrates a “self association rule” that determines whether a proposed answer is associated with user information;

FIG. 15 illustrates a “hint association rule” that determines whether the proposed answer is associated with the proposed hint in a particular relation;

FIG. 16 illustrates a “commonality rule” that determines whether the proposed answer is easily guessed from the proposed hint;

FIG. 17 is a flow chart describing an exemplary implementation of an authentication information security analysis process that incorporates features of the present invention;

FIG. 18 is a sample table from an exemplary authentication information security analysis rule-base that incorporates features of the present invention; and

FIG. 19 illustrates exemplary word counts from a search engine for certain words that can be analyzed according to the present invention to identify certain words that are highly correlated with certain users.

DETAILED DESCRIPTION

The present invention provides methods and apparatus that evaluate the security of authentication information that is extracted from a user. The authentication information might be, for example, personal identification numbers (PINs), passwords and query based passwords (questions and answers). According to one aspect of the invention, an authentication information security analysis process 1700 employs information extraction techniques to verify that the authentication information provided by a user is not easily searchable. Generally, the authentication information security analysis process 1700 measures the security of authentication information, such as query based passwords, provided by a user. The present invention assumes that the authentication information is provided by a cooperative user trying to generate a strong password (e.g., a proposed secret and hint in a query based password implementation). The authentication information security analysis process 1700 employs information extraction techniques to find and report relations between the proposed password and certain user information that might make the proposed password vulnerable to attack.

While the present invention is illustrated using authentication information based on numbers, such as telephone numbers, street addresses, post office numbers, Zip codes, dates (such as birthdays, anniversaries, and significant events), identification numbers (such as employee, membership, or social security numbers), physical statistics (such as height or weight) or monetary amounts, the present invention also applies to other forms of authentication information, as would be apparent to a person of ordinary skill in the art. For example, as discussed below, the present invention can be applied to evaluate the security of authentication information based on names, such as names of people or streets, or other textual information, such as automobile license plate numbers. Furthermore, while the present invention is illustrated using an exemplary query based password implementation, the present invention also applies to implementations that employ PINs and other passwords.

The exemplary authentication scheme of the present invention works with a user to define a question having an easily remembered answer that is not easily guessed by another person. In one implementation, a password enrollment/verification server 200, discussed further below in conjunction with FIG. 2, first guides a user to provide a good answer during an enrollment phase and then to provide a corresponding good question that will be used as a hint to the user during a subsequent verification phase. Generally, the user is guided to provide an answer within a topic area that is broad enough to apply to many users, yet narrow enough so that a given answer can be evaluated on the basis of how easily the answer may be guessed, given the question or information about the user (or both). In addition, the topics should be selected to be sufficiently private so the answers are hard to guess, yet not be so private that a user is not comfortable sharing the facts. For example, the present invention recognizes that for many users, numbers, such as telephone numbers, addresses, dates, identifying numbers or numerical facts, or textual facts, such as names of people or streets, are easy for a user to remember, yet are not easily guessed by an attacker. In addition, numbers or facts related to the personal history of the user may be easily remembered, yet not easily discovered by others.

Information extraction techniques are employed during the enrollment phase to verify the security of the questions and answers provided by the user. As discussed further below in conjunction with FIGS. 13 through 18, the information extraction techniques evaluate whether the provided questions and answers can be qualitatively or quantitatively correlated with the user by a potential attacker. Generally, the information extraction techniques evaluate whether (or the extent to which) a given answer can be correlated with a given user by performing an online or curriculum vitae search for any correlated material between the user and the answer. For example, if a user selects a telephone number of a person, the information extraction techniques determine if there is a predefined relationship between the owner of the telephone number and the user, such as a family member (self, sibling or parent), co-author, colleague or member of the same household. If so, this telephone number is said to be correlated with the user and is disallowed as an answer. As another example, if a user selects the jersey number of a sports figure and the information extraction techniques reveal that the user is a fan of the sports team on which the sports figure stars, then that selection would be disallowed. This correlation may be quantitatively weighted, such that if correlations within a predefined threshold are found, the answer may still be allowed; however, if many correlations exceeding the predefined threshold are found, then the answer is disallowed. Such correlation information may be implemented as one or more correlation rules that are evaluated during the enrollment phase, as discussed further below in conjunction with FIG. 4.

FIG. 1 illustrates a network environment in which the present invention can operate. As shown in FIG. 1, a user employing a user device 110 attempts to access a remote protected resource over a network 120. In order to access the protected resource, such as a hardware device or bank account, the user must present an appropriate password. The user password is generated during an enrollment phase by a password enrollment/verification server 200, discussed further below in conjunction with FIG. 2. The network(s) 120 may be any combination of wired or wireless networks, such as the Internet and the Public Switched Telephone Network (PSTN). The password enrollment/verification server 200 may be associated, for example, with a call center or web server. It is noted that the present invention also applies in a stand-alone mode, for example, to control access to a given personal computer. Thus, in such an embodiment, the password enrollment/verification server 200 would be integrated with the user device 110. It is also noted that the password generation and authentication functions performed by the password enrollment/verification server 200 can be performed by two distinct computing systems.

As previously indicated, the user is guided during an enrollment phase to provide answers that are easy for the user to remember, but are not easily guessed by an attacker. In addition, during a verification phase, when the user attempts to access a resource that is protected using the present invention, the password enrollment/verification server 200 challenges the user with one or more questions that the user has previously answered, as recorded in a user database 300, discussed further below in conjunction with FIG. 3.

FIG. 2 is a schematic block diagram of an exemplary password enrollment/verification server 200 incorporating features of the present invention. The password enrollment/verification server 200 may be any computing device, such as a personal computer, work station or server. As shown in FIG. 2, the exemplary password enrollment/verification server 200 includes a processor 210 and a memory 220, in addition to other conventional elements (not shown). The processor 210 operates in conjunction with the memory 220 to execute one or more software programs. Such programs may be stored in memory 220 or another storage device accessible to the password enrollment/verification server 200 and executed by the processor 210 in a conventional manner.

For example, as discussed below in conjunction with FIGS. 3 through 5, the memory 220 may store a user database 300, an enrollment process 400 and a verification process 500. Generally, the user database 300 records the password that was generated for each enrolled user. The enrollment process 400 guides the user to provide one or more answers and evaluates whether the answers are correlated with the user. The verification process 500 employs a query directed password protocol incorporating features of the present invention to authenticate a user.

In addition, as discussed below in conjunction with FIGS. 17 and 18, respectively, the memory 220 may store an authentication information security analysis process 1700 and an authentication information security analysis rule-base 1800. Generally, the authentication information security analysis process 1700 employs information extraction techniques to verify that the authentication information provided by a user is not easily searchable using one or more predefined rules from the authentication information security analysis rule-base 1800.

FIG. 3 is a sample table from an exemplary user database 300 of FIGS. 1 and 2. The user database 300 records the query based password for each enrolled user. As shown in FIG. 3, the user database 300 consists of a plurality of records, such as records 305-320, each associated with a different enrolled user. For each enrolled user, the user database 300 identifies the user in field 330, as well as the password (answer) in field 340 and optionally provides an associated reinforcement (hint) in field 350. For example, the user indicated in record 305 may have provided the telephone number 718-555-1212 as an answer, and the corresponding hint “Mandy's Phone Number,” where Mandy may be, for example, a pet or a child, but not the person who is identified with the telephone number in a directory. Generally, the user will be allowed to use the selected telephone number as a password, provided that the information extraction analysis does not determine that the answer is correlated with the user, as discussed below in conjunction with FIG. 4.
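By way of illustration only, a record of the user database 300 might be modeled as follows. This is a minimal sketch, assuming an in-memory dictionary keyed by a user identifier; the names `UserRecord`, `user_database`, `enroll` and the identifier `"user-305"` are hypothetical and do not appear in the figures.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UserRecord:
    """One row of the user database 300 (field numbers follow FIG. 3)."""
    user_id: str                 # field 330: identifies the enrolled user
    answer: str                  # field 340: the password (answer)
    hint: Optional[str] = None   # field 350: optional reinforcement (hint)


# Hypothetical in-memory stand-in for the user database 300, keyed by user id.
user_database: Dict[str, UserRecord] = {}


def enroll(record: UserRecord) -> None:
    """Store (or overwrite) the record for an enrolled user."""
    user_database[record.user_id] = record


# The example answer and hint of record 305; the identifier is made up.
enroll(UserRecord("user-305", "718-555-1212", "Mandy's Phone Number"))
```

The hint is optional in this sketch because field 350 is described as optionally provided.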

FIG. 4 is a flow chart describing an exemplary implementation of an enrollment process 400 of FIG. 2 incorporating features of the present invention. As previously indicated, the exemplary enrollment process 400 guides the user to provide one or more answers and evaluates whether the answers are correlated with the user. As shown in FIG. 4, a user is initially presented with one or more topics (and optionally sub-topics) for selection during step 410. As previously indicated, the user can be guided to provide an answer within a topic area that is broad enough to apply to many users, yet narrow enough so that a given answer can be evaluated on the basis of how easily the answer may be guessed, given the question or information about the user (or both). In addition, the topics should be selected to be sufficiently private so the answers are hard to guess, yet not be so private that a user is not comfortable sharing the facts. The user is instructed during step 420 to provide one or more answers and associated reminders that are related to the selected topic.

A test is performed during step 430 to determine if the answers or reminders (or both) are correlated with the user, as discussed below in conjunction with FIGS. 13 through 18. In one implementation, one or more correlation rules may be defined to evaluate whether a given answer is correlated with the user. For example, if a user selects a telephone number of a person, the information extraction analysis performed during step 430 can determine if there is a predefined relationship between the owner of the telephone number and the user, such as a family member (self, sibling or parent), co-author, colleague or member of the same household (qualitative correlation rule). The analysis correlates the number to the person by analyzing the number of hits obtained by using a search engine (such as Google.com or Orkut.com) where both the person and number appear on the same page. If the number of hits is higher than a chosen threshold, then a positive correlation is said to exist. Alternatively, the information extraction analysis may also use specialized web databases such as www.anywho.com that allow retrieval of information associated with a particular telephone number. The metric in this case is a positive match between the user's answer and the retrieved phone entry.
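The hit-count test described above can be sketched as follows. This is an illustrative sketch only: the function `is_correlated`, the `count_hits` callback and the canned `FAKE_INDEX` are assumptions introduced here, and no real search engine interface is implied. A real implementation would substitute a callback that queries an actual search service.

```python
from typing import Callable


def is_correlated(person: str, number: str,
                  count_hits: Callable[[str], int],
                  threshold: int) -> bool:
    """Return True when person and number co-occur on more pages than threshold."""
    # Quote both terms so that both must appear on the same page.
    query = f'"{person}" "{number}"'
    return count_hits(query) > threshold


# Hypothetical stand-in for a search engine: canned hit counts per query.
FAKE_INDEX = {
    '"John Smith" "718-555-1212"': 12,
    '"John Smith" "212-555-0100"': 0,
}


def fake_count_hits(query: str) -> int:
    """Look up a canned hit count; unknown queries have zero hits."""
    return FAKE_INDEX.get(query, 0)
```

Passing the search function as a parameter keeps the threshold logic independent of any particular search service, and lets the rule be exercised against fixed data.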

If it is determined during step 430 that at least one answer or reminder (or both) can be correlated with the user, then these answers are discarded during step 440 and the user is requested to select additional answers. If, however, it is determined during step 430 that the answers or reminders (or both) cannot be correlated with the user (for example, according to some predefined criteria), then a weight is assigned to each selected question during step 450 to estimate the level of difficulty an attacker would have in answering the question correctly. Generally, the weights are inversely related to the probability of an answer being chosen by a wide population of users. For instance, consider a multiple choice question regarding favorite foods, with the following possible answers: 1) steak, 2) liver, 3) ice cream, 4) corn, 5) chicken, 6) rutabaga. Let us say that in a sampling of the population, people chose these answers in the following respective proportions: 1) 30%, 2) 3%, 3) 40%, 4) 10%, 5) 14%, 6) 2%. Because ice cream and steak could be guessed by an attacker as more likely than liver and rutabaga to be the answer of a user, the system gives less weight to these more popular answers. One way to weight these answers is by the inverse of the probability, so the weights here would be approximately: 1) 3.33, 2) 33.3, 3) 2.5, 4) 10, 5) 7.14, 6) 50.
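The inverse-probability weighting of step 450 can be sketched as shown below, using the sample proportions of the favorite-food example. The function and variable names are hypothetical, and the weighting shown is only the one way mentioned above, not the only possible scheme.

```python
from typing import Dict


def inverse_probability_weights(proportions: Dict[str, float]) -> Dict[str, float]:
    """Weight each answer by 1/p, where p is the fraction of users choosing it."""
    weights = {}
    for answer, p in proportions.items():
        if not 0.0 < p <= 1.0:
            raise ValueError(f"proportion for {answer!r} must be in (0, 1]")
        weights[answer] = 1.0 / p
    return weights


# Sample proportions from the favorite-food example.
FOOD_PROPORTIONS = {
    "steak": 0.30,
    "liver": 0.03,
    "ice cream": 0.40,
    "corn": 0.10,
    "chicken": 0.14,
    "rutabaga": 0.02,
}

weights = inverse_probability_weights(FOOD_PROPORTIONS)
```

Rarely chosen answers such as rutabaga receive the largest weights, reflecting that an attacker is less likely to guess them.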

The selected questions, and corresponding weights and answers, are recorded in the user database 300 during step 460 before program control terminates.

FIG. 5 is a flow chart describing an exemplary implementation of the verification process 500 of FIG. 2 incorporating features of the present invention. As previously indicated, the verification process 500 employs a query directed password protocol incorporating features of the present invention to authenticate a user. As shown in FIG. 5, the user initially identifies himself (or herself) to the password enrollment/verification server 200 during step 510. During step 520, the verification process 500 obtains the user password that was generated for this user during the enrollment phase from the user database 300. The user is challenged for the password during step 530. The challenge may optionally include the hint associated with the password.

A test is performed during step 540 to determine if the password provided by the user matches the password obtained from the user database 300. If it is determined during step 540 that the passwords do not match, then a further test is performed during step 550 to determine if the maximum number of retry attempts has been exceeded. If it is determined during step 550 that the maximum number of retry attempts has not been exceeded, then the user can optionally be presented with a hint during step 560 before again being challenged for the password. If it was determined during step 550 that the maximum number of retry attempts has been exceeded, then the user is denied access during step 580.

If, however, it was determined during step 540 that the password provided by the user matches the password obtained from the user database 300, then the user is provided with access during step 570.
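The control flow of the verification process 500 can be sketched as follows. This is a minimal sketch under stated assumptions: the database is modeled as a dictionary mapping a user identifier to an (answer, hint) pair, the `ask` callback stands in for the challenge presented to the user, the default of two retries is arbitrary, and an unknown user is simply denied. None of these names or choices are taken from FIG. 5.

```python
from typing import Callable, Dict, Optional, Tuple


def verify(user_id: str,
           database: Dict[str, Tuple[str, Optional[str]]],
           ask: Callable[[Optional[str]], str],
           max_retries: int = 2) -> bool:
    """Return True if access is granted (step 570), False if denied (step 580)."""
    record = database.get(user_id)          # step 520: look up the stored password
    if record is None:
        return False                        # assumption: unknown users are denied
    answer, hint = record
    retries = 0
    shown_hint = None                       # first challenge carries no hint here
    while True:
        if ask(shown_hint) == answer:       # steps 530 and 540: challenge and compare
            return True
        if retries >= max_retries:          # step 550: retry budget used up
            return False
        retries += 1
        shown_hint = hint                   # step 560: optionally present the hint
```

For example, with `max_retries=2` a user who answers incorrectly once and then correctly is granted access, while three consecutive incorrect answers lead to denial.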

Provision of Answers Related to a Selected Topic

FIG. 6 is an exemplary user interface 600 that presents a user with a set of topics 610 (during step 410 of the enrollment process 400) from which the user can select a given topic for which the user will provide one or more answers. For example, the exemplary user interface 600 allows a user to select topics related to personal history, discussed below in conjunction with FIGS. 7 through 10, key events, discussed below in conjunction with FIGS. 11 through 15, personal preferences, make your own number, or a random number.

In an exemplary implementation, if a user selects the first topic (personal history) from the set of topics 610, then the user will be presented with the user interface 700, shown in FIG. 7. FIG. 7 is an exemplary user interface 700 that presents the user with a set of sub-topics 710 from which the user can select a given topic for which the user will provide one or more answers. As shown in FIG. 7, the exemplary interface 700 allows a user to provide answers that are related to telephone numbers, street addresses, dates, numbers from facts, identifying numbers, or other numbers.

In an exemplary implementation, if a user selects the first subtopic (telephone numbers) from the set of sub-topics 710, then the user will be presented with the user interface 800, shown in FIG. 8. FIG. 8 is an exemplary user interface 800 that allows a user to enter a proposed answer for evaluation in a field 810 and hit a button 820 to have the answer evaluated, as discussed further below. The interface 800 may optionally provide a user with guidelines or suggestions for good or bad answers. For example, the interface 800 may indicate that some bad choices include the telephone number of the user or another family member. Thus, a user can enter a candidate answer and receive feedback about whether the candidate answer is correlated with the user. For example, a reverse telephone look-up can be performed to determine if the telephone number is associated with the user or another person having one or more defined relations to the user, such as a family member or colleague. In addition, frequently used telephone numbers, such as those associated with large corporations or institutions, such as United Air Lines or the White House, can also be flagged as problematic.

FIG. 9 is an exemplary user interface 900 that allows a user to enter a proposed reminder or hint associated with a particular answer in a field 910 and hit a button 920 to have the reminder evaluated, as discussed below. Just like a proposed answer, a proposed reminder can be evaluated using information extraction techniques. The interface 900 may optionally provide a user with guidelines or suggestions for good or bad reminders. For example, the interface 900 may indicate that some bad choices include the name and address of a particular person (whether identified explicitly by name or by a unique label that can be resolved by an attacker, such as “my mother”). Thus, a user can enter a candidate reminder and receive feedback about whether the candidate reminder is correlated with the user or the answer. For example, a search can be performed in a telephone directory to obtain the telephone number or address (or both) of a person identified in the proposed reminder to determine if the identified person is correlated with the user or the answer. In further variations, the proposed reminder may be presented by the user at login, stored by the user or memorized by the user.

It is noted that the proposed answer entered by the user using the interface 800 of FIG. 8 can optionally be evaluated and confirmed before the user is requested to enter a proposed reminder using the interface 900 of FIG. 9. If the user clicks on the button 820 or 920 to have the proposed answer or reminder evaluated, the authentication information security analysis process 1700 will be initiated to evaluate the proposed answer, reminder or both.

FIG. 10 is an exemplary dialog box 1000 that can be presented to a user if the authentication information security analysis process 1700 determines that the proposed telephone number might be vulnerable to attack. The interface 1000 can optionally include a button 1010 that allows the user to obtain additional information regarding the reasons why the proposed telephone number was rejected.

FIG. 11 illustrates an alternate embodiment for a dialog box 1100 that can present additional information to a user if the authentication information security analysis process 1700 determines that the proposed telephone number might be vulnerable to attack. The interface 1100 can optionally include one or more buttons 1110-1112 that allow the user to selectively obtain additional information for each of the reasons why the proposed telephone number was rejected. For example, a first button 1110 can provide additional information providing details of a directory search indicating that a proposed telephone number is associated with a person of a given relation (e.g., a family member). A second button 1111 can provide additional information providing details of a directory search indicating that a proposed telephone number indicates strong associations between the user and the person associated with the proposed telephone number (e.g., a family member). Finally, a third button 1112 can provide additional information providing details of a directory search indicating that the person associated with the telephone number was in the “top N” results for a web search for the name of the user.

Upon a successful evaluation by the authentication information security analysis process 1700, an exemplary user interface 1200, shown in FIG. 12, can present the selected answer and reminder to the user and optionally allow the user to specify, for example, upon completion of an enrollment, whether reminders should be sent. The exemplary interface 1200 presents the user with an answer in a field 1210 and the corresponding reminder in a field 1220. The user can optionally specify whether any reminders should be sent by electronic mail or telephone, and the frequency of such reminders, using fields 1230, 1240, respectively.

Verifying Security of Extracted Authentication Information

As previously indicated, the present invention provides methods and apparatus that evaluate the security of authentication information extracted from a user. The present invention employs information extraction techniques to find and report relations between the proposed password and certain user information that might make the proposed password vulnerable to attack. In the exemplary query based password implementation, a query based password comprises a proposed hint and a proposed answer. Thus, the present invention will assess and report any relations between the proposed hint, proposed answer and user information.

FIG. 13 illustrates the relationship between the user information 1310, proposed answer 1320 and proposed hint 1330 that can be tested by the present invention for various relations that render the proposed password vulnerable to attack. As shown in FIG. 13, an exemplary user, John Smith, has associated user information 1310 that may be obtained, for example, from the user database 300 that records a password generated for each enrolled user and other user information, such as an address (not shown). The authentication information security analysis process 1700 can optionally interact with the user to collect additional user information 1310. The user has interacted with the password enrollment/verification server 200 of FIG. 2 using the interfaces 800, 900 of FIGS. 8 and 9, to enter a proposed answer 1320 and a proposed hint (reminder) 1330.

The types of hints 1330 and user background information 1310 are strongly related to the kind of secret that is employed for authentication. For example, when the spectrum of hints is highly constrained, the searches performed to assess the hint are easier. In an implementation where the user has greater flexibility in entering hints (i.e., where the user is allowed to be more expressive and thus the hints may be more useful), however, the searches become more challenging. Similarly, when the authentication information security analysis process 1700 has richer user background information 1310 available, the security assessment can be more comprehensive, but takes more time.

In the exemplary embodiment, where telephone numbers are used as query based passwords, the telephone number of the user can be obtained from a number of databases, including web sites, such as anywho.com, that provide a telephone number given name or address information (or both), or can provide a name or address information (or both) given a telephone number. In addition, depending on the application, proprietary databases, such as an employee directory, may be available to provide additional information.

As discussed further below in conjunction with FIGS. 17 and 18, the exemplary authentication information security analysis process 1700 employs an authentication information security analysis rule-base 1800, shown in FIG. 18. The rules stored in the authentication information security analysis rule-base 1800 provide a flexible mechanism for the authentication information security analysis process 1700 to assess various predefined security vulnerabilities associated with each rule.

In the illustrative embodiment, the rules stored in the authentication information security analysis rule-base 1800 may generally be classified into one of three exemplary rule classes. A first class of rules, referred to as “self association rules,” illustrated in FIG. 14, determine whether the proposed answer 1320 is associated with the user information 1310. As shown in FIG. 14, the exemplary self association rule 1400 determines whether the user 1310 is the owner of the telephone number that was entered as a proposed answer 1320. If so, this telephone number is said to be correlated with the user and is disallowed as an answer.

A second class of rules, referred to as “hint association rules,” illustrated in FIG. 15, determine whether the proposed answer 1320 is associated with the proposed hint 1330 in a particular relation. As shown in FIG. 15, the exemplary hint association rule 1500 determines whether the person associated with the proposed hint 1330 is in a particular relation with the user 1310. For example, the information extraction techniques performed according to the hint association rules can determine if there is a predefined relationship between the owner of the telephone number and the user, such as a family member (self, sibling or parent), co-author, colleague or member of the same household. If so, this telephone number is said to be correlated with the user and is disallowed as an answer. For example, the hint association rule 1500 can determine whether the user 1310 is related to the owner of the telephone number that was entered as a proposed answer 1320. The proposed answer can be searched using a reverse telephone lookup, such as anywho.com, and the resulting name can be compared to the user name for a family relation (i.e., whether the owner of the telephone number and the user have the same last name) or neighbor relation (i.e., whether the owner of the telephone number and the user live on the same street or within a specified distance).
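The family and neighbor comparisons described above can be sketched as follows. This is only a minimal illustration: the Person record, the in-memory REVERSE_DIRECTORY (a stand-in for a reverse telephone lookup service) and the find_relation function are hypothetical names that do not appear in the specification, and the same-last-name and same-street tests are the simplest reading of the relations described.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    first_name: str
    last_name: str
    street: str


# Hypothetical stand-in for a reverse telephone lookup service.
REVERSE_DIRECTORY = {
    "2125550101": Person("Mary", "Smith", "Elm Street"),
    "2125550199": Person("Alan", "Jones", "Oak Avenue"),
}


def normalize(number: str) -> str:
    """Keep only the digits so that different formats compare equal."""
    return "".join(ch for ch in number if ch.isdigit())


def find_relation(user: Person, proposed_answer: str) -> Optional[str]:
    """Return the detected relation between the user and the number's owner,
    or None when the lookup finds no owner or no relation."""
    owner = REVERSE_DIRECTORY.get(normalize(proposed_answer))
    if owner is None:
        return None
    if owner == user:
        return "self"
    if owner.last_name.lower() == user.last_name.lower():
        return "family"      # same last name
    if owner.street.lower() == user.street.lower():
        return "neighbor"    # same street
    return None
```

A proposed answer for which find_relation returns anything other than None would be disallowed under this sketch. A real implementation would query an external directory and would likely need fuzzier name and address matching.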

A third class of rules, referred to as “commonality rules,” illustrated in FIG. 16, determine whether the proposed answer 1320 is easily guessed from the proposed hint 1330. As shown in FIG. 16, the exemplary commonality rule 1600 determines whether a popular business entity is the owner of the telephone number that was entered as a proposed answer 1320. If so, this telephone number is said to be easily guessed and is disallowed as an answer. Examples of information that is easily guessed include the height of Mount Fuji, the telephone number of a popular business and the number on a jersey of a popular professional athlete.

As previously indicated, during a verification process, the user is presented with a reminder or hint that the user provided during an enrollment process. The user must then enter the corresponding answer that the user provided during enrollment, in order to obtain access to the requested device or resource. Thus, it can be assumed that an attacker has access to the user information 1310 and reminder 1330. The present invention employs information extraction techniques to simulate the activities of an attacker and try to determine whether the proposed answer 1320 can be easily obtained from either the user information 1310 or reminder 1330. If the present invention can find a correlation through an online search between either the user information or the reminder and the proposed answer, the proposed answer should be rejected. The online search may be performed, for example, using a search engine, such as Google.com.

For example, the online search may employ a query comprised of a given user name and proposed answer. The documents that satisfy the query can be evaluated to determine if there is an association between the user name and answer.

FIG. 17 is a flow chart describing an exemplary implementation of an authentication information security analysis process 1700 that employs information extraction techniques to verify that the authentication information provided by a user is not easily searchable. Generally, the authentication information security analysis process 1700 measures the security of authentication information, such as query based passwords, provided by a user. One challenge, of course, is how to measure the “security” of a question in a query based password implementation.

The authentication information security analysis process 1700 is illustrated using authentication information based on telephone numbers. As previously indicated, the authentication information security analysis process 1700 can be extended to assess the security of other numbers, such as street addresses, post office numbers, Zip codes, dates (such as birthdays, anniversaries, and significant events), identification numbers (such as employee or membership numbers), physical statistics (such as height or weight) or monetary amounts, as well as other forms of authentication information, as would be apparent to a person of ordinary skill in the art.

As shown in FIG. 17, the authentication information security analysis process 1700 initially loads the authentication information security analysis rule-base 1800, discussed further below in conjunction with FIG. 18, during step 1705. As previously indicated, the rules stored in the authentication information security analysis rule-base 1800 provide a flexible mechanism for the authentication information security analysis process 1700 to assess various predefined security vulnerabilities associated with each rule. In the illustrative embodiment, the authentication information security analysis process 1700 loads three exemplary rules from the authentication information security analysis rule-base 1800 that are tested by the authentication information security analysis process 1700. The three exemplary rules are each associated with one of the three exemplary rule classes. Thus, exemplary self association, hint association and commonality rules are tested during steps 1710, 1720 and 1730, respectively. The authentication information security analysis process 1700 can assess additional rules from the authentication information security analysis rule-base 1800, discussed below, as would be apparent to a person of ordinary skill.

A self association test is performed during step 1710 to determine whether the proposed answer is associated directly with the user. If it is determined during step 1710 that the proposed answer is associated directly with the user, the proposed answer is said to be correlated with the user and is disallowed as an answer. Program control thus proceeds to step 1750, discussed below.

A hint association test is performed during step 1720 to determine whether the proposed answer is associated with the proposed hint in a particular relation, such as a family member (self, sibling or parent), co-author, teammates, colleagues or members of the same household or community. If it is determined during step 1720 that the proposed answer is associated with the proposed hint, the proposed answer is said to be correlated with the user and is disallowed as an answer. Program control thus proceeds to step 1750, discussed below.

A commonality test is performed during step 1730 to determine whether the proposed answer is easily guessed from the proposed hint. If it is determined during step 1730 that the proposed answer is easily guessed from the proposed hint, the proposed answer and proposed hint are disallowed as a query based password. Program control thus proceeds to step 1750, discussed below.

If each of the exemplary tests performed during steps 1710, 1720 and 1730 pass, program control will proceed to step 1740 where the proposed answer and/or hint are accepted. Upon a successful evaluation by the authentication information security analysis process 1700, the exemplary user interface 1200 of FIG. 12 can present the selected answer and reminder to the user and optionally allow the user to specify whether reminders should be sent.

If any of the exemplary tests performed during steps 1710, 1720 and 1730 fail, program control will proceed to step 1750 where the proposed answer and/or hint are rejected. When the authentication information security analysis process 1700 determines that the proposed answer and/or hint might be vulnerable to attack, one of the exemplary user interfaces 1000 or 1100 of FIGS. 10 and 11 can present the user with additional information regarding the reasons why the proposed password was rejected.
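The control flow of steps 1710 through 1750 can be sketched as a sequence of tests in which the first failure rejects the proposal. The sketch below is illustrative only: the Proposal and Decision types, the analyze function and the three toy lookup tables are hypothetical and stand in for the searches that the specification describes.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Proposal(NamedTuple):
    user_name: str
    answer: str
    hint: str


class Decision(NamedTuple):
    accepted: bool
    failed_rule: Optional[str]  # name of the first rule that failed, if any


# A rule test returns True when the proposal passes the rule.
RuleTest = Callable[[Proposal], bool]


def analyze(proposal: Proposal, tests: List[Tuple[str, RuleTest]]) -> Decision:
    """Run the tests in order; reject on the first failure (step 1750),
    accept when every test passes (step 1740)."""
    for rule_name, passes in tests:
        if not passes(proposal):
            return Decision(False, rule_name)
    return Decision(True, None)


# Hypothetical toy data standing in for directory and search results.
USER_NUMBERS = {"John Smith": {"2125550100"}}
RELATED_NUMBERS = {"John Smith": {"2125550101"}}
POPULAR_NUMBERS = {"8005550000"}

EXAMPLE_TESTS: List[Tuple[str, RuleTest]] = [
    ("self association",   # step 1710
     lambda p: p.answer not in USER_NUMBERS.get(p.user_name, set())),
    ("hint association",   # step 1720
     lambda p: p.answer not in RELATED_NUMBERS.get(p.user_name, set())),
    ("commonality",        # step 1730
     lambda p: p.answer not in POPULAR_NUMBERS),
]
```

Returning the name of the failed rule gives the rejection interfaces of FIGS. 10 and 11 something to report, and additional rules can be appended to the list without changing analyze.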

Improving Search Results

The present invention employs information extraction techniques to simulate the activities of an attacker and try to determine whether the proposed answer 1320 can be easily obtained from either the user information 1310 or reminder 1330. If the present invention can find a correlation through an online search between either the user information or the reminder and the proposed answer, the proposed answer should be rejected. The online search may be performed, for example, using a search engine, such as Google.com.

As with any online search, the accuracy of the authentication information security analysis process 1700 is impaired by false hits (i.e., unrelated results) in the results of the query. The false hits cause the authentication information security analysis process 1700 to unnecessarily reject reasonable query based passwords. The security assessment of the present invention can be improved by using meta-searching, local proximity techniques, number classification or a combination of the foregoing to reduce the number of false hits.

A meta-search engine may optionally be employed to reduce the number of false hits. A meta-search employs a number of search engines in parallel and compares the results from each search engine. Generally, the more search engines a given web page gets a hit from, the more reliable the web page will be in terms of carrying the user information. An exemplary meta-search engine is Dogpile.com that provides a collection of 16 search engines, such as Google, Overture, Ask Jeeves, and About. While Google is generally perceived to retrieve the most relevant results, the meta-search engine helps to reduce the number of false hits.
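The agreement heuristic can be sketched by counting how many engines return each page and keeping only pages that enough engines agree on. The function name rank_by_agreement, the min_engines parameter and the dictionary input format are assumptions made for illustration; the specification does not prescribe a particular threshold.

```python
from collections import Counter
from typing import Dict, Iterable, List


def rank_by_agreement(results_by_engine: Dict[str, Iterable[str]],
                      min_engines: int = 2) -> List[str]:
    """Return the URLs reported by at least `min_engines` engines,
    ordered from most to least agreement (ties broken alphabetically)."""
    counts: Counter = Counter()
    for urls in results_by_engine.values():
        counts.update(set(urls))  # each engine votes at most once per URL
    kept = [(url, n) for url, n in counts.items() if n >= min_engines]
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return [url for url, _ in kept]
```

Pages returned by only one engine are dropped under the default setting, which trades some recall for fewer false hits.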

Local proximity techniques can optionally be employed to reduce the number of false hits. Local proximity techniques can be employed to ensure that the hits from a search are in the proper context. For example, local proximity techniques can be employed to ensure that search results corresponding to a proposed telephone number are actually telephone numbers. A telephone number is typically comprised of an area code (first three digits), a prefix (next three digits) and a telephone number portion (last four digits). The area code, prefix and telephone number should be treated as separate tokens in the query to cover the various potential formats of a telephone number. For example, a web page that contains “212.998.3365” will be missed by a query specified as “212-998-3365” (for exact phrase match). However, if the various components are searched separately, the sets of digits must be sufficiently close to one another to conform to a telephone number. False hits will occur when the numbers occur separately within the same page. In one implementation, the local proximity technique can calculate a minimum average distance of the numbers and reject a given web page if the average distance is greater than a defined threshold.
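One possible reading of the minimum average distance test is sketched below. It assumes that distance is measured in token positions, that the components must appear in order, and that the average is taken over the gaps between consecutive components; the specification does not fix these details, and the function names and default threshold are hypothetical.

```python
import re
from itertools import product
from typing import Optional, Sequence


def min_average_distance(text: str, parts: Sequence[str]) -> Optional[float]:
    """Smallest average gap (in tokens) between consecutive parts that occur
    in order in the text, or None when no in-order occurrence exists."""
    # Split into runs of letters or digits; punctuation acts as a separator.
    tokens = re.findall(r"[A-Za-z]+|\d+", text)
    positions = [[i for i, tok in enumerate(tokens) if tok == part]
                 for part in parts]
    if len(parts) < 2 or any(not found for found in positions):
        return None
    best: Optional[float] = None
    for combo in product(*positions):
        gaps = [b - a for a, b in zip(combo, combo[1:])]
        if any(gap <= 0 for gap in gaps):
            continue  # components out of order
        average = sum(gaps) / len(gaps)
        if best is None or average < best:
            best = average
    return best


def page_matches(text: str, parts: Sequence[str], threshold: float = 1.0) -> bool:
    """Accept the page only if the parts appear close enough together."""
    distance = min_average_distance(text, parts)
    return distance is not None and distance <= threshold
```

With a threshold of 1.0, “212.998.3365” and “(212) 998-3365” both match the parts ("212", "998", "3365"), while a page that mentions the three numbers in unrelated places does not.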

Number classification techniques can also optionally be employed to reduce the number of false hits. Number classification techniques can be employed to ensure that the hits from a search are due to the proper type of numbers (or other information). For example, in the exemplary telephone number implementation, the number classification techniques can be employed to ensure that the hits from a search are due to telephone numbers. The present invention recognizes that the numbers (area code, prefix, telephone number) hit by mistake tend to have a different usage, such as publication page numbers, identification numbers or portions thereof, or dates.

The automatic prediction of the usage of numbers can be used as a criterion for filtering the search results. In one exemplary implementation, number classification techniques are employed to distinguish between telephone numbers and non-phone numbers (such as addresses, publication pages or dates).
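The specification does not say how the usage of a number is predicted. As a rough stand-in, the sketch below uses a few hand-written regular expressions to label a text snippet and keeps only the snippets labeled as telephone numbers. The patterns, labels and function names are all assumptions; a practical classifier would likely be trained on labeled examples.

```python
import re
from typing import Iterable, List

# Hand-written patterns; deliberately simple and far from exhaustive.
PHONE = re.compile(r"\(?\b\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}\b")
DATE = re.compile(
    r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"
    r"|\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?\s+\d{1,2},\s+\d{4}\b"
)
PAGE = re.compile(r"\b(?:pp?\.|pages?)\s*\d+", re.IGNORECASE)
ADDRESS = re.compile(
    r"\b\d+\s+(?:\w+\s+)+(?:Street|St|Avenue|Ave|Road|Rd)\b", re.IGNORECASE
)


def classify_number_usage(snippet: str) -> str:
    """Guess how the numbers in a snippet are being used."""
    if PHONE.search(snippet):
        return "telephone"
    if DATE.search(snippet):
        return "date"
    if PAGE.search(snippet):
        return "page"
    if ADDRESS.search(snippet):
        return "address"
    return "unknown"


def filter_telephone_hits(snippets: Iterable[str]) -> List[str]:
    """Keep only the search hits whose numbers look like telephone numbers."""
    return [s for s in snippets if classify_number_usage(s) == "telephone"]
```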

FIG. 18 is a sample table from an exemplary authentication information security analysis rule-base 1800. As shown in FIG. 18, the exemplary authentication information security analysis rule-base 1800 includes a plurality of rows 1810-1820, each associated with a different rule. For each rule, identified by a rule name in field 1830, the authentication information security analysis rule-base 1800 indicates the type of rule in field 1840, e.g., self association, hint association or commonality rule, and the rule conditions in field 1850. As described above in conjunction with FIG. 17, the exemplary rules in the authentication information security analysis rule-base 1800 are tested by the authentication information security analysis process 1700.
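A rule-base with the three fields described for FIG. 18 could be represented as a list of records, as in the following sketch. The Rule and RuleType names are hypothetical, the condition strings paraphrase the rule descriptions in this specification, and the rule type shown for the word association record is a guess, since its class is not stated in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class RuleType(Enum):
    SELF_ASSOCIATION = "self association"
    HINT_ASSOCIATION = "hint association"
    COMMONALITY = "commonality"


@dataclass(frozen=True)
class Rule:
    record: int          # row in the rule-base
    name: str            # field 1830
    rule_type: RuleType  # field 1840
    conditions: str      # field 1850


RULE_BASE: List[Rule] = [
    Rule(1810, "User Telephone Number", RuleType.SELF_ASSOCIATION,
         "user is the record owner of the proposed telephone number"),
    # Rule type assumed; the text does not state the class of this rule.
    Rule(1811, "Word Association", RuleType.SELF_ASSOCIATION,
         "user is strongly associated with the proposed word"),
    Rule(1812, "Hint Related to User", RuleType.HINT_ASSOCIATION,
         "person associated with the hint is in a particular relation with the user"),
    Rule(1820, "Common Telephone Number", RuleType.COMMONALITY,
         "a popular business entity owns the proposed telephone number"),
]


def rules_of_type(rule_base: List[Rule], rule_type: RuleType) -> List[Rule]:
    """Select the rules of one class, preserving table order."""
    return [rule for rule in rule_base if rule.rule_type == rule_type]
```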

For example, the “user telephone number” rule associated with record 1810 determines whether the user is the record owner of the proposed telephone number. The “word association” rule associated with record 1811 determines whether the user is strongly associated with a word that is presented as the proposed password. The “word association” rule recognizes that some words are easily attacked. For example, Author A may have written a book about compilers, Author B may have written a book containing “programming pearls” and Author C may have written a book (or work in an area) about image analysis. Thus, for each author, a search engine may identify the word counts 1900 for certain words, as shown in FIG. 19. Thus, certain words are highly correlated with certain users. For example, for Author A, there is a high correlation with the word “compilers” that is not found for Authors B or C. Thus, the word “compilers” should not be allowed for Author A. The word association rule can employ a percentage cutoff (e.g., if the word count exceeds X %, the word may not be employed as a password) or employ tests based on statistical significance.
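The percentage cutoff can be sketched as follows. The interpretation of the percentage as the word's share of all counted words for that user, the 25% default cutoff and the sample counts are assumptions for illustration; they are not the values of FIG. 19.

```python
from typing import Dict


def word_share(counts: Dict[str, int], word: str) -> float:
    """Fraction of the user's counted hits that belong to `word`."""
    total = sum(counts.values())
    return counts.get(word, 0) / total if total else 0.0


def is_word_allowed(counts: Dict[str, int], word: str, cutoff: float = 0.25) -> bool:
    """Disallow a word whose share of the user's hits exceeds the cutoff."""
    return word_share(counts, word) <= cutoff


# Hypothetical word counts for one author.
AUTHOR_A_COUNTS = {"compilers": 900, "pearls": 50, "image": 50}
```

Under these sample counts, “compilers” accounts for 90% of Author A's hits and would be rejected, while “image” accounts for 5% and would be allowed.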

The “Hint Related to User” rule associated with record 1812 determines whether the person associated with a proposed hint is in a particular relation with the user. For example, the “Hint Related to User” rule may determine if there is a predefined relationship between the owner of the telephone number and the user, such as a family member (self, sibling or parent), co-author, colleague or member of the same household. If so, this telephone number is said to be correlated with the user and is disallowed as an answer. The “Hint Related to User” rule 1812 can also encompass relationships that are detected indirectly. For example, a query based on the hint and user information may reveal that the hint is a childhood friend of the user. A threshold can be defined based on the number of associations between the user and the name associated with the hint.

The “Common Telephone Number” rule associated with record 1820 determines whether a popular business entity is the owner of a proposed telephone number. If so, this telephone number is said to be too easily guessed and is disallowed as an answer. It is noted that an attacker may always try the “top N” most popular telephone numbers for every user, and these numbers should be excluded as passwords. Additional commonality rules can assess whether the names used as proposed passwords are too common. For example, a name can be analyzed to determine how common a name is in general and/or in a given context. It is noted that a name such as “Smith” may be more common than “Singh” in some places, but the opposite is true in other places. In addition, commonality rules can assess whether the association between a proposed hint and password is too strong. For example, a search engine may indicate that the word count (in thousands) for “Columbus” and “1492” may be very high, relative to other potential years (any year other than 1492). Similar searches can be created to search for other popular associations, including common telephone numbers, historical dates, jersey numbers for athletes and text (e.g., for the proposed hint “first president” and password “George Washington”).
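The comparison of a hint and answer pair against alternative answers can be sketched as a ratio of hit counts. The ratio-to-mean measure, the cutoff of 10 and the sample counts below are assumptions chosen for illustration; the specification only states that the count for the proposed pair is compared with the counts for other candidates.

```python
import math
from typing import Dict


def association_ratio(hit_counts: Dict[str, int], answer: str) -> float:
    """Hits for (hint, answer) divided by the mean hits for (hint, other)."""
    own = hit_counts.get(answer, 0)
    others = [n for candidate, n in hit_counts.items() if candidate != answer]
    mean_others = sum(others) / len(others) if others else 0.0
    if mean_others == 0:
        return math.inf if own > 0 else 0.0
    return own / mean_others


def is_too_guessable(hit_counts: Dict[str, int], answer: str,
                     max_ratio: float = 10.0) -> bool:
    """Flag answers that co-occur with the hint far more than the alternatives."""
    return association_ratio(hit_counts, answer) >= max_ratio


# Hypothetical hit counts (in thousands) for the hint "Columbus" with each year.
COLUMBUS_HITS = {"1492": 900, "1491": 10, "1493": 20, "1500": 30}
```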

Finally, the search results for the user information 1310, proposed answer 1320 and proposed hint 1330 can be used to assign a security score to the proposed password, such as the hint/answer pair in a query based password implementation. For example, the search performed for the “word association” rule can be easily extended to assess a score for the (user, keyword) pairs. Similarly, a low security score can be assessed to common names, while higher scores can be assessed to names that are determined to be more rare. A threshold can optionally be assigned to determine whether the determined security score is sufficient to accept a proposed password.
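One way to turn commonness into a score is to use the negative base-2 logarithm of a name's relative frequency, so that rarer names score higher. This particular formula, the threshold and the sample counts are assumptions; the specification does not define how the score is computed.

```python
import math
from typing import Dict


def security_score(name_counts: Dict[str, int], name: str) -> float:
    """Score in bits: -log2 of the name's relative frequency.
    Names never observed are treated as maximally rare."""
    total = sum(name_counts.values())
    count = name_counts.get(name, 0)
    if total == 0 or count == 0:
        return math.inf
    return -math.log2(count / total)


def is_score_sufficient(name_counts: Dict[str, int], name: str,
                        threshold: float = 1.5) -> bool:
    """Accept the proposed name only if its score meets the threshold."""
    return security_score(name_counts, name) >= threshold


# Hypothetical name frequencies in some population.
NAME_COUNTS = {"Smith": 500, "Singh": 250, "Okonkwo": 250}
```

Because the counts can be drawn from a context-specific population, the same name can score differently in different places, consistent with the “Smith” and “Singh” example above.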

System and Article of Manufacture Details

As is known in the art, the methods and apparatus discussed herein may be distributed as an article of manufacture that itself comprises a computer readable medium having computer readable code means embodied thereon. The computer readable program code means is operable, in conjunction with a computer system, to carry out all or some of the steps to perform the methods or create the apparatuses discussed herein. The computer readable medium may be a recordable medium (e.g., floppy disks, hard drives, compact disks, or memory cards) or may be a transmission medium (e.g., a network comprising fiber-optics, the world-wide web, cables, or a wireless channel using time-division multiple access, code-division multiple access, or other radio-frequency channel). Any medium known or developed that can store information suitable for use with a computer system may be used. The computer-readable code means is any mechanism for allowing a computer to read instructions and data, such as magnetic variations on a magnetic medium or height variations on the surface of a compact disk.

The computer systems and servers described herein each contain a memory that will configure associated processors to implement the methods, steps, and functions disclosed herein. The memories could be distributed or local and the processors could be distributed or singular. The memories could be implemented as an electrical, magnetic or optical memory, or any combination of these or other types of storage devices. Moreover, the term “memory” should be construed broadly enough to encompass any information able to be read from or written to an address in the addressable space accessed by an associated processor. With this definition, information on a network is still within a memory because the associated processor can retrieve the information from the network.

It is to be understood that the embodiments and variations shown and described herein are merely illustrative of the principles of this invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention. For example, while the invention has been illustrated using telephone numbers as query based passwords, the authentication information might also be, for example, personal identification numbers (PINs) or other passwords based on dates or street addresses (including Zip codes and post office boxes).

In a date implementation, the proposed dates can be evaluated for relation to general, well-known dates, such as Jul. 4, 1776 (741776) or obtainable user-related dates, such as birthdays or anniversaries. To improve the search results for passwords based on dates, a date classification scheme can be employed, in a similar manner to the telephone number scheme described above.
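The comparison of a proposed numeric password against known dates can be sketched by generating several digit encodings of each date. The set of encodings and the function names below are assumptions; only the unpadded month-day-year form (741776) appears in the text above.

```python
from datetime import date
from typing import Iterable, Set


def date_encodings(d: date) -> Set[str]:
    """A few common all-digit spellings of a date (not exhaustive)."""
    m, day, y = d.month, d.day, d.year
    return {
        f"{m}{day}{y}",                       # 741776
        f"{m:02d}{day:02d}{y}",               # 07041776
        f"{day}{m}{y}",                       # 471776
        f"{day:02d}{m:02d}{y}",               # 04071776
        f"{m:02d}{day:02d}{y % 100:02d}",     # 070476
        f"{y}{m:02d}{day:02d}",               # 17760704
    }


def matches_known_date(proposed: str, known_dates: Iterable[date]) -> bool:
    """True if the digits of the proposed password spell one of the dates."""
    digits = "".join(ch for ch in proposed if ch.isdigit())
    return any(digits in date_encodings(d) for d in known_dates)
```

The known_dates argument could combine well-known dates with user-related dates such as birthdays or anniversaries when those are obtainable.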

In a street address implementation, the proposed addresses (or portions thereof) can be evaluated for relation to general, well-known addresses, such as The White House, 1600 Pennsylvania Avenue NW, Washington, D.C. 20500, or obtainable user-related addresses, such as address of home or business. To improve the search results for passwords based on addresses, an address classification scheme can be employed, in a similar manner to the telephone number scheme described above.

1. A method for evaluating a password proposed by a user, comprising: receiving said proposed password from said user; and ensuring that a correlation between said user and said proposed password does not violate one or more predefined correlation rules.

2. The method of claim 1, wherein said one or more predefined correlation rules evaluate whether that said proposed password can be qualitatively correlated with said user.

3. The method of claim 1, wherein said one or more predefined correlation rules evaluate whether said proposed password can be quantitatively correlated with said user.

4. The method of claim 1, wherein said proposed password is comprised of a proposed answer and a proposed hint and wherein said one or more predefined correlation rules evaluate whether said proposed answer can be correlated with said proposed hint in a particular relation.

5. The method of claim 4, wherein said particular relation is selected from the group consisting essentially of: self, family member, co-author, teammate, colleague, neighbor, community member or household member.

6. The method of claim 1, wherein said proposed password is comprised of a proposed answer and a proposed hint and wherein said one or more predefined correlation rules evaluate whether said proposed answer can be obtained from said proposed hint.

7. The method of claim 1, wherein said proposed password is an identifying number.

8. The method of claim 7, wherein said one or more predefined correlation rules evaluate whether said identifying number identifies a person in a particular relationship to said user.

9. The method of claim 7, wherein said one or more predefined correlation rules evaluate whether said identifying number is a top N most commonly used identifying number.

10. The method of claim 7, wherein said one or more predefined correlation rules evaluate whether said identifying number identifies a top N commercial entity.

11. The method of claim 7, wherein said one or more predefined correlation rules evaluate whether said identifying number identifies said user.

12. The method of claim 7, wherein said identifying number is a portion of a telephone number.

13. The method of claim 7, wherein said identifying number is a portion of an address.

14. The method of claim 7, wherein said identifying number is a portion of social security number.

15. The method of claim 1, wherein said proposed password is a word.

16. The method of claim 15, wherein said one or more predefined correlation rules evaluate whether a correlation between said word and said user exceeds a predefined threshold.

17. The method of claim 1, wherein said correlation is determined by performing a meta-search.

18. The method of claim 1, wherein said step of ensuring a correlation further comprises the step of performing a meta-search.

19. The method of claim 1, wherein said step of ensuring a correlation further comprises the step of performing a local proximity evaluation.

20. The method of claim 1, wherein said step of ensuring a correlation further comprises the step of performing a number classification.

21. An apparatus for evaluating a password proposed by a user, comprising: a memory; and at least one processor, coupled to the memory, operative to: receive said proposed password from said user; and evaluate whether a correlation between said user and said proposed password violates one or more predefined correlation rules.

22. The apparatus of claim 21, wherein said one or more predefined correlation rules evaluate whether said proposed password can be correlated with said user.

23. The apparatus of claim 21, wherein said proposed password is comprised of a proposed answer and a proposed hint and wherein said one or more predefined correlation rules evaluate whether said proposed answer can be correlated with said proposed hint in a particular relation.

24. The apparatus of claim 21, wherein said proposed password is comprised of a proposed answer and a proposed hint and wherein said one or more predefined correlation rules evaluate whether said proposed answer can be obtained from said proposed hint.

25. The apparatus of claim 21, wherein said proposed password is an identifying number.

26. The apparatus of claim 25, wherein said one or more predefined correlation rules evaluate whether said identifying number identifies a person in a particular relationship to said user.

27. An article of manufacture for evaluating a password proposed by a user, comprising a machine readable medium containing one or more programs which when executed implement the steps of: receive said proposed password from said user; and evaluate whether a correlation between said user and said proposed password violates one or more predefined correlation rules.